Sexual content
Musk's Grok to bar users from generating sexual images of real people
Elon Musk's X has said it will "geoblock" users of xAI's Grok from creating images of people in "bikinis, underwear, and similar attire" amid a global backlash against the chatbot's sexualised images. "We have implemented technological measures to prevent the Grok account from allowing the editing of images of real people in revealing clothing such as bikinis," X's safety team said in a statement late on Wednesday. The statement did not elaborate on the nature of the geoblocking or other safeguards. X claimed to have "zero tolerance for any forms of child sexual exploitation, non-consensual nudity, and unwanted sexual content". Grok faces investigations and bans from regulators and governments around the world following a deluge of sexualised AI images on the platform in recent weeks.
- Europe > United Kingdom (0.16)
- Asia > Malaysia (0.07)
- Asia > Indonesia (0.07)
- (10 more...)
- Law (1.00)
- Government > Regional Government (0.51)
Why Are Grok and X Still Available in App Stores?
Elon Musk's AI chatbot Grok is being used to flood X with thousands of sexualized images of adults and apparent minors wearing minimal clothing. Apple and Google have removed other "nudify" apps but continue to host X and Grok. Some of this content appears to violate X's own policies, which prohibit sharing illegal content such as child sexual abuse material (CSAM), and may also violate the guidelines of Apple's App Store and the Google Play store.
- North America > United States > California (0.15)
- South America > Venezuela (0.05)
- Europe > Slovakia (0.05)
- (3 more...)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.70)
UK to ban deepfake AI 'nudification' apps
The UK government says it will ban so-called nudification apps as part of efforts to tackle misogyny online. New laws - announced on Thursday as part of a wider strategy to halve violence against women and girls - will make it illegal to create and supply AI tools letting users edit images to seemingly remove someone's clothing. The new offences would build on existing rules around sexually explicit deepfakes and intimate image abuse, the government said. "Women and girls deserve to be safe online as well as offline," said Technology Secretary Liz Kendall. "We will not stand by while technology is weaponised to abuse, humiliate and exploit them through the creation of non-consensual sexually explicit deepfakes."
- North America > United States (0.16)
- North America > Central America (0.15)
- Oceania > Australia (0.06)
- (15 more...)
- Law (1.00)
- Information Technology > Security & Privacy (0.95)
- Government > Regional Government > Europe Government > United Kingdom Government (0.69)
Why Disney's Most Scandalous Deal Is Such a Grim Development
The Industry: Disney's Deal With OpenAI Is So Much Worse Than You Think. The $1 billion partnership allows users to create A.I.-generated images of the company's iconic characters. That's not going to end well for anyone.
- North America > United States > California (0.15)
- North America > United States > New York (0.05)
- Information Technology (1.00)
- Government (1.00)
- Law > Intellectual Property & Technology Law (0.70)
- Media > Film (0.48)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.77)
NDM: A Noise-driven Detection and Mitigation Framework against Implicit Sexual Intentions in Text-to-Image Generation
Sun, Yitong, Huang, Yao, Zhang, Ruochen, Chen, Huanran, Ruan, Shouwei, Duan, Ranjie, Wei, Xingxing
Despite the impressive generative capabilities of text-to-image (T2I) diffusion models, they remain vulnerable to generating inappropriate content, especially when confronted with implicit sexual prompts. Unlike explicit harmful prompts, these subtle cues, often disguised as seemingly benign terms, can unexpectedly trigger sexual content due to underlying model biases, raising significant ethical concerns. However, existing detection methods are primarily designed to identify explicit sexual content and therefore struggle to detect these implicit cues. Fine-tuning approaches, while effective to some extent, risk degrading the model's generative quality, creating an undesirable trade-off. To address this, we propose NDM, the first noise-driven detection and mitigation framework, which detects and mitigates implicit malicious intention in T2I generation while preserving the model's original generative capabilities. Specifically, we introduce two key innovations: first, we leverage the separability of early-stage predicted noise to develop a noise-based detection method that identifies malicious content with high accuracy and efficiency; second, we propose a noise-enhanced adaptive negative guidance mechanism that optimizes the initial noise by suppressing the prominent region's attention, thereby enhancing the effectiveness of adaptive negative guidance for sexual mitigation. Experimentally, we validate NDM on both natural and adversarial datasets, demonstrating its superior performance over existing SOTA methods, including SLD, UCE, and RECE. Code and resources are available at https://github.com/lorraine021/NDM.
- Europe > Switzerland > Zürich > Zürich (0.14)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.05)
- Asia > China > Beijing > Beijing (0.05)
- North America > United States > New York > New York County > New York City (0.04)
- Research Report (0.82)
- Workflow (0.68)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
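The abstract's core idea, that early-stage predicted noise is separable by prompt intent, can be illustrated with a toy sketch. Everything below is hypothetical (simulated noise statistics and a simple nearest-centroid probe), not NDM's actual method; see the linked repository for the real implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def predicted_noise(malicious: bool, dim: int = 64) -> np.ndarray:
    """Stand-in for a diffusion model's early-step predicted noise.

    Hypothetical: malicious prompts shift the noise statistics by a
    small constant, mimicking the separability the paper reports."""
    return rng.normal(0.0, 1.0, dim) + (0.5 if malicious else 0.0)

# "Train" a nearest-centroid probe on simulated noise from both classes.
benign = np.stack([predicted_noise(False) for _ in range(200)])
harmful = np.stack([predicted_noise(True) for _ in range(200)])
w = harmful.mean(axis=0) - benign.mean(axis=0)  # separating direction
threshold = w @ (harmful.mean(axis=0) + benign.mean(axis=0)) / 2.0

def flag(noise: np.ndarray) -> bool:
    """Flag a prompt as malicious from its early predicted noise alone."""
    return float(w @ noise) > threshold

hits = sum(flag(predicted_noise(True)) for _ in range(100))
rejects = sum(not flag(predicted_noise(False)) for _ in range(100))
```

The real framework operates on the denoiser's actual noise prediction at early timesteps; the sketch only shows that once the two noise distributions are separable, even a linear probe can exploit it.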
Deepfakes on Demand: the rise of accessible non-consensual deepfake image generators
Hawkins, Will, Russell, Chris, Mittelstadt, Brent
Advances in multimodal machine learning have made text-to-image (T2I) models increasingly accessible and popular. However, T2I models introduce risks such as the generation of non-consensual depictions of identifiable individuals, otherwise known as deepfakes. This paper presents an empirical study exploring the accessibility of deepfake model variants online. Through a metadata analysis of thousands of publicly downloadable model variants on two popular repositories, Hugging Face and Civitai, we demonstrate a huge rise in easily accessible deepfake models. Almost 35,000 examples of publicly downloadable deepfake model variants are identified, primarily hosted on Civitai. These deepfake models have been downloaded almost 15 million times since November 2022, with the models targeting a range of individuals from global celebrities to Instagram users with under 10,000 followers. Both Stable Diffusion and Flux models are used for the creation of deepfake models, with 96% of these targeting women and many signalling intent to generate non-consensual intimate imagery (NCII). Deepfake model variants are often created via the parameter-efficient fine-tuning technique known as low rank adaptation (LoRA), requiring as few as 20 images, 24GB VRAM, and 15 minutes of time, making this process widely accessible via consumer-grade computers. Despite these models violating the Terms of Service of hosting platforms, and regulation seeking to prevent dissemination, these results emphasise the pressing need for greater action to be taken against the creation of deepfakes and NCII.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- Europe > Greece > Attica > Athens (0.05)
- North America > United States > New York > New York County > New York City (0.04)
- (3 more...)
- Overview (0.93)
- Research Report > New Finding (0.93)
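The low-rank adaptation (LoRA) technique the paper identifies as the enabler can be sketched generically. The layer dimensions and rank below are hypothetical, chosen only to show why the method is cheap enough for consumer hardware:

```python
import numpy as np

def lora_update(W: np.ndarray, A: np.ndarray, B: np.ndarray,
                alpha: float = 1.0) -> np.ndarray:
    """Apply a low-rank adaptation: W' = W + alpha * (B @ A).

    A is (r, d_in) and B is (d_out, r); only A and B are trained,
    so the adapter adds r*(d_in + d_out) parameters instead of
    retraining all d_out*d_in weights of the frozen layer."""
    return W + alpha * (B @ A)

d_in, d_out, r = 768, 768, 8          # hypothetical layer size and rank
W = np.zeros((d_out, d_in))           # frozen base weight (toy values)
A = np.random.default_rng(0).normal(0, 0.01, (r, d_in))
B = np.zeros((d_out, r))              # B starts at zero, so W' == W initially

full_params = d_out * d_in            # 589,824 weights in the base layer
lora_params = r * (d_in + d_out)      # 12,288 trainable adapter weights
```

Because only A and B are trained (about 2% of the layer in this toy setup), fine-tuning fits in modest VRAM from a few dozen images, which is precisely the accessibility problem the paper quantifies.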
Gender and content bias in Large Language Models: a case study on Google Gemini 2.0 Flash Experimental
This study evaluates the biases in Gemini 2.0 Flash Experimental, a state-of-the-art large language model (LLM) developed by Google, focusing on content moderation and gender disparities. By comparing its performance to ChatGPT-4o, examined in the author's previous work, the analysis highlights some differences in ethical moderation practices. Gemini 2.0 demonstrates reduced gender bias, notably with female-specific prompts achieving a substantial rise in acceptance rates compared to results obtained by ChatGPT-4o. It adopts a more permissive stance toward sexual content and maintains relatively high acceptance rates for violent prompts, including gender-specific cases. Despite these changes, whether they constitute an improvement is debatable. While gender bias has been reduced, this reduction comes at the cost of permitting more violent content toward both males and females, potentially normalizing violence rather than mitigating harm. Male-specific prompts still generally receive higher acceptance rates than female-specific ones. These findings underscore the complexities of aligning AI systems with ethical standards, highlighting progress in reducing certain biases while raising concerns about the broader implications of the model's permissiveness. Ongoing refinements are essential to achieve moderation practices that ensure transparency, fairness, and inclusivity without amplifying harmful content.
- North America > United States > Texas (0.14)
- Europe > Ukraine (0.14)
- Europe > Italy > Emilia-Romagna > Metropolitan City of Bologna > Bologna (0.04)
- (6 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
A Systematic Review of Open Datasets Used in Text-to-Image (T2I) Gen AI Model Safety
Rouf, Rakeen, Bavalatti, Trupti, Ahmed, Osama, Potdar, Dhaval, Jawed, Faraz
This work is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0). For the definitive version, see DOI 10.1109/ACCESS.2025.3539933. Disclaimer: This research involves topics that may include disturbing results. Any explicit content has been redacted, and potentially disturbing results have been presented in a neutral and anonymized manner to minimize emotional distress to the readers. Abstract: Novel research aimed at text-to-image (T2I) generative AI safety often relies on publicly available datasets for training and evaluation, making the quality and composition of these datasets crucial. This paper presents a comprehensive review of the key datasets used in T2I research, detailing their collection methods, compositions, semantic and syntactic diversity of prompts, and the quality, coverage, and distribution of harm types in the datasets. By highlighting the strengths and limitations of the datasets, this study enables researchers to find the most ...
- Europe > Austria > Vienna (0.14)
- North America > United States > North Carolina > Durham County > Durham (0.04)
- Asia > Pakistan > Punjab > Lahore Division > Lahore (0.04)
- (6 more...)
- Overview (1.00)
- Research Report > New Finding (0.67)
SafeWatch: An Efficient Safety-Policy Following Video Guardrail Model with Transparent Explanations
Chen, Zhaorun, Pinto, Francesco, Pan, Minzhou, Li, Bo
With the rise of generative AI and rapid growth of high-quality video generation, video guardrails have become more crucial than ever to ensure safety and security across platforms. Current video guardrails, however, are either overly simplistic, relying on pure classification models trained on simple policies with limited unsafe categories, which lack detailed explanations, or prompting multimodal large language models (MLLMs) with long safety guidelines, which are inefficient and impractical for guardrailing real-world content. To bridge this gap, we propose SafeWatch, an efficient MLLM-based video guardrail model designed to follow customized safety policies and provide multi-label video guardrail outputs with content-specific explanations in a zero-shot manner. In particular, unlike traditional MLLM-based guardrails that encode all safety policies autoregressively, causing inefficiency and bias, SafeWatch uniquely encodes each policy chunk in parallel and eliminates their position bias such that all policies are attended to simultaneously with equal importance. In addition, to improve efficiency and accuracy, SafeWatch incorporates a policy-aware visual token pruning algorithm that adaptively selects the most relevant video tokens for each policy, discarding noisy or irrelevant information. This allows for more focused, policy-compliant guardrailing with significantly reduced computational overhead. Considering the limitations of existing video guardrail benchmarks, we propose SafeWatch-Bench, a large-scale video guardrail benchmark comprising over 2M videos spanning six safety categories and over 30 tasks, ensuring comprehensive coverage of all potential safety scenarios. SafeWatch outperforms SOTA by 28.2% on SafeWatch-Bench, 13.6% on benchmarks, cuts costs by 10%, and delivers top-tier explanations validated by LLM and human reviews.
- South America (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > United States > Illinois > Champaign County > Urbana (0.04)
- (3 more...)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)
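The policy-aware visual token pruning described in the abstract can be illustrated with a generic top-k relevance selection. The dot-product scorer and the shapes below are assumptions for illustration, not SafeWatch's actual learned scoring function:

```python
import numpy as np

def prune_tokens(video_tokens: np.ndarray, policy_embedding: np.ndarray,
                 keep: int) -> np.ndarray:
    """Keep the `keep` video tokens most relevant to one policy chunk.

    Relevance here is plain dot-product similarity, a stand-in for
    whatever learned scorer the real model uses."""
    scores = video_tokens @ policy_embedding   # (n_tokens,) relevance scores
    top = np.argsort(scores)[-keep:]           # indices of the top-k scores
    return video_tokens[np.sort(top)]          # preserve temporal order

rng = np.random.default_rng(1)
tokens = rng.normal(size=(100, 32))   # 100 visual tokens, 32-dim (hypothetical)
policy = rng.normal(size=32)          # embedding of one policy chunk
kept = prune_tokens(tokens, policy, keep=16)
```

Pruning per policy means each policy chunk attends to a small, relevant subset of tokens rather than the full video, which is where the claimed efficiency gain comes from.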
If Eleanor Rigby Had Met ChatGPT: A Study on Loneliness in a Post-LLM World
Loneliness, or the lack of fulfilling relationships, significantly impacts a person's mental and physical well-being and is prevalent worldwide. Previous research suggests that large language models (LLMs) may help mitigate loneliness. However, we argue that the use of widespread LLMs like ChatGPT is more prevalent--and riskier, as they are not designed for this purpose. To explore this, we analysed user interactions with ChatGPT, particularly those outside of its marketed use as a task-oriented assistant. In dialogues classified as lonely, users frequently (37%) sought advice or validation, and received good engagement. However, ChatGPT failed in sensitive scenarios, like responding appropriately to suicidal ideation or trauma. We also observed a 35% higher incidence of toxic content, with women being 22 times more likely to be targeted than men. Our findings underscore ethical and legal questions about this technology, and note risks like radicalisation or further isolation. We conclude with recommendations for research and industry to address loneliness.
- Europe > United Kingdom (0.28)
- North America > United States > New York > New York County > New York City (0.05)
- North America > Canada > Ontario > Toronto (0.04)
- (5 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.66)
- Law (1.00)
- Health & Medicine > Consumer Health (1.00)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.46)